Neural Network Coupled Model for Conversion and Exploitation of Heterogeneous Lexical Annotations
HUANG Depeng, LI Zhenghua, GONG Chen, ZHANG Min
Acta Scientiarum Naturalium Universitatis Pekinensis    2020, 56 (1): 97-104.   DOI: 10.13209/j.0479-8023.2019.098
To expand the scale of manually annotated data and thereby improve model performance, we attempt to make full use of existing heterogeneous annotations to learn model parameters. We extend the coupled sequence labeling model proposed by Li et al. (2015) under a BiLSTM-based deep learning framework. The neural coupled model learns its parameters directly from two heterogeneous training datasets and predicts two optimal tag sequences simultaneously during the test phase. Extensive experiments are conducted on the part-of-speech (POS) tagging task and the joint word segmentation and POS (WS&POS) tagging task. The results show that the neural coupled approach is superior to other methods for exploiting heterogeneous lexical data, including the multi-task learning method and the traditional discrete-feature coupled model. The neural coupled model achieves higher performance in both scenarios, i.e., annotation conversion and boosting the final target-side tagging accuracy by exploiting heterogeneous data.
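The abstract does not describe the mechanics of coupling. As a rough illustration only, coupled sequence labeling is commonly described in terms of a bundled tag space, where each label pairs one tag from each annotation standard and a single-side gold tag is treated as an ambiguous set of bundled tags. The sketch below assumes that reading; the tag sets, the `@` separator, and the function names are hypothetical and are not taken from the paper.

```python
from itertools import product

# Hypothetical toy tag sets for two annotation standards (not from the paper).
TAGS_A = ["NN", "VV"]
TAGS_B = ["n", "v", "vn"]

# Bundled tag space: every pairing of one side-A tag with one side-B tag.
BUNDLED = [f"{a}@{b}" for a, b in product(TAGS_A, TAGS_B)]


def ambiguous_labels(tag, side):
    """Return all bundled tags compatible with a gold tag from one side only."""
    if side == "A":
        return [f"{tag}@{b}" for b in TAGS_B]
    if side == "B":
        return [f"{a}@{tag}" for a in TAGS_A]
    raise ValueError("side must be 'A' or 'B'")


def split_bundled(bundled_tag):
    """Recover the two single-side tags from one predicted bundled tag."""
    a, b = bundled_tag.split("@")
    return a, b
```

Under this assumption, a sentence annotated only in standard A still constrains training, because the model is asked to put its probability mass on the bundled tags returned by `ambiguous_labels`. At test time, each predicted bundled tag yields one tag per standard through `split_bundled`, which is one way to read "predicts two optimal sequences simultaneously".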
Syntax-Enhanced UCCA Semantic Parsing
JIANG Wei, LI Zhenghua, ZHANG Min
Acta Scientiarum Naturalium Universitatis Pekinensis    2020, 56 (1): 89-96.   DOI: 10.13209/j.0479-8023.2019.099
Considering the close correlation between syntactic and semantic structures, this paper attempts to add syntactic information to the Universal Conceptual Cognitive Annotation (UCCA) semantic parsing model to improve parsing performance. Based on the state-of-the-art graph-based UCCA semantic parser, we propose and compare four different approaches for incorporating syntactic information. Experiments are conducted on the English benchmark dataset of the SemEval-2019 semantic parsing shared task. The results on both the in-domain and out-of-domain evaluation data show that syntax-enhanced methods achieve significant improvements in UCCA parsing. After BERT is incorporated, syntactic information is still beneficial to some extent.
Hypernym Relation Classification Based on Word Pattern
SUN Jiawei, LI Zhenghua, CHEN Wenliang, ZHANG Min
Acta Scientiarum Naturalium Universitatis Pekinensis    2019, 55 (1): 1-7.   DOI: 10.13209/j.0479-8023.2018.055

The authors propose a hypernym relation classification method based on word patterns, which effectively alleviates the sparsity problem suffered by the traditional path-based method. Furthermore, the paper combines the path-based method and the distributional method via word pattern embeddings. To demonstrate the effectiveness of the proposed approach, the authors manually annotated a Chinese hypernym dataset containing 12,000 word pairs. The experimental results show that the proposed word pattern embedding approach is effective and achieves an F1 score of 95.36%.

Conversion of Multiple Resources for POS Tagging
GAO Enting, CHAO Jiayuan, LI Zhenghua
Acta Scientiarum Naturalium Universitatis Pekinensis   
The authors propose an annotation conversion method that uses multiple resources for POS tagging, aiming to convert source-side annotations into target-side annotations and then combine the data to obtain a larger training set. Two innovative strategies are proposed. The first strategy uses reliability information of guide features. The second strategy uses ambiguous labelings to improve the quality of the converted data. Results demonstrate that the first strategy is helpful for annotation conversion, while the second contributes little to conversion.